THE METHOD OF MONTHLY MEANS FOR DETERMINATION
OF A SEASONAL VARIATION

By William L. Hart, University of Minnesota 



1. INTRODUCTION 

Many of the mathematical processes used in statistical analysis are
of such frequent application and are, relatively, so ancient in their
origin that one ceases to think of the reasons for their well-known
validity. When a new process is invented, however, it should be
subjected to close scrutiny to determine the sphere in which its application
is valid. The present paper deals with two methods which have been
used for the determination of the seasonal variation in certain
statistical data of economics. Arguments are presented against one method,¹
which is relatively complicated in its numerical details, and a complete
logical foundation is given for a second method,² which involves merely
the formation of monthly means.

Before considering the particular problems of this paper it is perti- 
nent to decide on suitable criteria for testing the validity of a mathe- 
matical process used in statistical analysis. In the field of pure 
mathematics we may state certain definite hypotheses and, by logical 
reasoning, derive certain equally definite conclusions. It is realized, 
after very brief reflection, that most of the well-established mathemat- 
ical methods of statistics have been taken over bodily from correspond- 
ing problems already solved in the field of pure mathematics. Such 
statistical methods have perfectly defined spheres of usefulness. We 
are usually able to state that if our statistical problem, when couched 
in mathematical terms, satisfies a given set of conditions, then our 
particular method leads us to conclusions, the accuracy of which is 
dependable. Many times, moreover, we are also able to state that 
even when the given set of conditions is not satisfied by our data, never- 
theless the conclusions obtained by our method are the best, of a given 
type, that could possibly be obtained by any method. 

To illustrate the preceding paragraph, consider the method of least
squares for determining a line y = at + b to fit a set of points in a plane
containing the rectangular (y, t) axes. This set of points may be
considered as the graph of a time series, a table of the values of a variable
quantity y for a sequence of values of the time represented by t. The

1 Persons, Rev. of Economic Stat., Jan., 1919, p. 5.

2 E. W. Kemmerer, Seasonal Variation in the Demand for Money and Capital in the United States,
Report of the National Monetary Commission (1910).
following mathematical theorem is the logical basis for our statistical
method in the present type of problem : 

If the set of points actually lies on a straight line, this is the line which is 
given by the method of least squares. Moreover, if the points do not lie 
on a straight line, that line which we obtain by our method is such that the 
sum of the squares of the distances (measured parallel to the y axis) from 
the given points to our line is less than would be the case for any other 
straight line in the (y, t) plane. 

The determination of the secular trend¹ in a time series may be based
on this theorem, and our confidence in the results obtained is based 
upon the knowledge that the secular trend we determine is the best 
approximation we can hope for, where the word "best" is used in the 
sense of the method of least squares. 
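The theorem may be illustrated numerically. The following is a minimal sketch in Python, not part of the original argument; the data points and the function names `fit_line` and `sum_sq` are hypothetical and are chosen only for illustration. It computes the least-squares line y = at + b and compares its sum of squared vertical distances with that of a slightly different line.

```python
def fit_line(ts, ys):
    """Least-squares line y = a*t + b through the points (t, y)."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    a = sxy / sxx
    b = y_bar - a * t_bar
    return a, b


def sum_sq(ts, ys, a, b):
    """Sum of squared distances, measured parallel to the y axis."""
    return sum((y - (a * t + b)) ** 2 for t, y in zip(ts, ys))


# Hypothetical data: points scattered about a rising line.
ts = [0, 1, 2, 3, 4, 5]
ys = [1.0, 2.9, 5.2, 6.8, 9.1, 11.0]

a, b = fit_line(ts, ys)
best = sum_sq(ts, ys, a, b)

# Any other straight line gives a larger sum of squares.
worse = sum_sq(ts, ys, a + 0.1, b - 0.2)
print(best < worse)
```

When the points lie exactly on a line, `fit_line` returns that line; otherwise every competing line tried in this way yields a larger sum.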

When a mathematical method of statistics is not based upon a 
theorem from pure mathematics, it is on the same plane as any empirical 
process. It should be subjected to experimental test in a controlled 
problem, where the results which should be obtained are known 
a priori. If a sensible problem can be proposed, where the results are 
known, in which the method in question leads to conclusions at vari- 
ance with the truth, doubt should be cast upon all conclusions obtained 
by use of the method. Moreover, if two methods are available, of 
which one is founded logically and the other only empirically, with a 
slight doubt, perhaps, as to its general validity, it is obvious that the
logically founded method is to be preferred. In the following discussion 
a logical background is given for the method of monthly means for 
determining seasonal variation. Afterwards an example of an admit- 
ted type is treated by Persons' method, and it is shown that the 
results obtained are widely at variance with the truth. The method 
of monthly means gives the correct result in this example. 

2. THE METHOD OF MONTHLY MEANS 

It is presumed that our data consist of a series of monthly entries 
of the values of some quantity over a series of k consecutive years. It 
will be supposed that any secular change originally present in the data 
has been eliminated, for example, by fitting a straight line to the original 
data by the method of least squares and by then subtracting the ordi- 
nates of this straight line from the corresponding entries in the original 
data. Let f(t) represent the value of our entry t months from the
zero date. For simplicity of nomenclature t = 0 will be spoken of as
January of the first year, t = 1 as February of the first year, t = 12 as
January of the second year, t = 26 as March of the third year, etc.

1 Persons, loc. cit., p. 12. 
The method of monthly means for the determination of the seasonal
variation in f(t) is described in the following paragraph:

Form a new January entry by taking the arithmetic mean of all the 
January entries over the k given years; form a February entry by taking 
the arithmetic mean of all February entries; . . . form a De- 
cember entry similarly from the December entries over the k years. 
This series of twelve monthly means is then to be taken as our best 
approximation to the values which f(t) would have had in all years at
the corresponding months if no causes had been influencing the values 
except those which lead to seasonal variation. 
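The process just described can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `monthly_means` and the convention that the data are held in a list beginning with January of the first year are assumptions made here.

```python
def monthly_means(f):
    """Return the twelve monthly means, January first.

    f is a list of 12*k monthly entries with secular trend already
    removed; f[0] is January of the first year (t = 0).
    """
    if len(f) == 0 or len(f) % 12 != 0:
        raise ValueError("expected a whole number of years of monthly data")
    k = len(f) // 12
    # Entry m of the result averages f(m), f(m + 12), ..., f(m + 12(k - 1)).
    return [sum(f[m + 12 * j] for j in range(k)) / k for m in range(12)]


# Hypothetical two-year series: the second year runs 2 units above the first.
example = list(range(1, 13)) + list(range(3, 15))
print(monthly_means(example))
```

For the hypothetical two-year series shown, each monthly mean lies midway between the two entries for that month.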

A seasonal variation is defined as a periodic change with the period 
one year. The graph of a seasonal variation over a period of one year 
is not necessarily a single graceful wave with a well-defined crest and 
trough. The essential feature of a strictly seasonal variation is that its 
numerical value is the same, at corresponding months, in all years. A 
seasonal variation thus yields the same values for the Januarys of all 
years, and similar constant results for the other months. Hence, it is 
obvious that the following theorem is true: 

Theorem (1). If f(t) actually is a periodic function whose period is
one year, the monthly entries obtained by our method are exactly the values
of f(t) at the corresponding months.

The second theorem we shall state is designed to show explicitly the
exact power of the method. Let P(t) represent the periodic function,
with the period one year, whose values for all Januarys, Februarys,
etc., are the corresponding monthly means obtained above. Thus,
P(0) = P(12) = P(24), etc., = the January mean; P(3) = P(15) = P(27),
etc., = the April mean, etc.

Theorem (2). Let f(t) be any function of the time t known from t = 0
to t = 12k, that is, over a period of k years. Then, the sum of the squares
of the residuals [f(t) - P(t)], for all values of t, or

$$[f(0) - P(0)]^2 + [f(1) - P(1)]^2 + \cdots + [f(12k-1) - P(12k-1)]^2 \tag{1}$$

is smaller in value than it would be if any other periodic function with
period one year were used in place of P(t).

It is seen that Theorem (2) shows P(t) to be the answer to the
following problem:

(2) To determine a periodic function P(t) which, in the sense of the
method of least squares, is the best approximation to f(t).

The comparison of (2) with the theorem of the introduction shows 
the analogy between the method of monthly means and the method by 
which we determine a secular trend. The proof of Theorem (2) is 
extremely simple. It is well known that the sum of the squares of the 
deviations of a group of quantities from their arithmetic mean is less 
than the sum of the squares of the deviations of the quantities from
any number other than the arithmetic mean. The sum (1) consists
of 12k terms which can be thought of as the sum of 12 groups of k
terms each, one group corresponding to each month of the year. The
January group, for example, is the sum of the squares of the
deviations of the k January entries about their arithmetic mean, which is
P(0). Hence, the January group in (1) is smaller in value than it
would be if any other number were substituted in place of P(0).
Similar reasoning applies to the other eleven monthly groups in (1).
Hence, the sum (1) is smaller in value than it would be if any other
periodic function, with the period one year, were used in place of P(t).
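The proof may be checked by experiment. The sketch below, in Python, is an illustration under stated assumptions: the data are random numbers generated only for the test, k is taken as 5, and the names `monthly_means` and `sum_sq_residuals` are hypothetical. It compares the sum (1) for the monthly means with the sum for many other periodic functions of period one year. It also checks the identity behind the proof: if Q(t) is any such function, the sum for Q exceeds the sum for P(t) by k times the sum of the twelve squared differences between the monthly values of Q and of P.

```python
import random


def monthly_means(f):
    """Twelve monthly means of a series of 12*k entries, January first."""
    k = len(f) // 12
    return [sum(f[m + 12 * j] for j in range(k)) / k for m in range(12)]


def sum_sq_residuals(f, p):
    """Sum (1): squared residuals of f about the periodic function
    whose twelve monthly values are listed in p."""
    return sum((f[t] - p[t % 12]) ** 2 for t in range(len(f)))


random.seed(0)
k = 5
f = [random.uniform(-10, 10) for _ in range(12 * k)]  # hypothetical data

p = monthly_means(f)
base = sum_sq_residuals(f, p)

# Try many other periodic functions q obtained by disturbing p.
trials = []
for _ in range(1000):
    q = [v + random.uniform(-1, 1) for v in p]
    excess = k * sum((a - b) ** 2 for a, b in zip(q, p))
    trials.append((sum_sq_residuals(f, q), excess))

print(all(s >= base for s, _ in trials))
```

No disturbed function tried in this way gives a smaller sum than the monthly means.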

TABLE I

    t    f(t)        t    f(t)        t    f(t)        t    f(t)

    0     0.0       30    -7.1       60     0.0       90     0.0
    1     2.3       31    -9.0       61     2.1       91    -2.1
    2     4.3       32   -10.4       62     4.0       92    -4.0
    3     5.8       33   -11.2       63     5.4       93    -5.4
    4     6.7       34   -11.4       64     6.2       94    -6.2
    5     7.1       35   -10.9       65     6.5       95    -6.5
    6     7.1       36   -10.0       66     6.5       96    -6.5
    7     7.0       37    -8.9       67     6.4       97    -6.4
    8     6.9       38    -7.9       68     6.4       98    -6.4
    9     7.2       39    -7.2       69     6.9       99    -6.9
   10     7.9       40    -6.9       70     7.8      100    -7.8
   11     8.9       41    -7.0       71     9.0      101    -9.0
   12    10.0       42    -7.1       72    10.5      102   -10.5
   13    10.9       43    -7.1       73    11.8      103   -11.8
   14    11.4       44    -6.7       74    12.7      104   -12.7
   15    11.2       45    -5.8       75    13.0      105   -13.0
   16    10.4       46    -4.3       76    12.7      106   -12.7
   17     9.0       47    -2.3       77    11.8      107   -11.8
   18     7.1       48     0.0       78    10.5      108   -10.5
   19     5.1       49     1.0       79     9.0      109    -9.0
   20     3.3       50     1.7       80     7.8      110    -7.8
   21     1.8       51     2.0       81     6.9      111    -6.9
   22     0.9       52     1.7       82     6.4      112    -6.4
   23     0.3       53     1.0       83     6.4      113    -6.4
   24     0.0       54     0.0       84     6.5      114    -6.5
   25    -0.3       55    -1.0       85     6.5      115    -6.5
   26    -0.8       56    -1.7       86     6.2      116    -6.2
   27    -1.8       57    -2.0       87     5.4      117    -5.4
   28    -3.3       58    -1.7       88     4.0      118    -4.0
   29    -5.1       59    -1.0       89     2.1      119    -2.1



Let us apply the method of monthly means to an example arranged 
so as to forestall a certain objection which might be raised against the 
process. Let the values of our data f(t) be as given in Table I, where
k = 10. The data have already been corrected for secular trend. 
The monthly means are found to be : 



   Jan.   Feb.  March  April    May   June   July   Aug.  Sept.   Oct.   Nov.   Dec.
    0.0    1.0    1.7    2.0    1.7    1.0    0.0   -1.0   -1.7   -2.0   -1.7   -1.0

In this case the function P(t) which we determine as representing the
seasonal variation has the monthly values listed in the table. The
standard deviation of the residual function f(t) - P(t) is found to be
approximately 7. At first thought, one might consider this large
value as an indication that our periodic function P(t) was a poor
approximation to the true seasonal variation. As a matter of fact,
our function P(t) in this case is exactly equal to the true seasonal
variation, for f(t) was computed from the following formulas:

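$$f(t) = 2\sin\left(\frac{t}{12}\cdot 360^\circ\right) + 10\sin\left(\frac{t}{48}\cdot 360^\circ\right), \quad \text{from } t = 0 \text{ to } t = 48,$$

$$f(t) = 2\sin(t\cdot 30^\circ), \quad \text{from } t = 48 \text{ to } t = 60,$$

$$f(t) = 2\sin(t\cdot 30^\circ) + 11\sin\left(\frac{t}{60}\cdot 360^\circ\right), \quad \text{from } t = 60 \text{ to } t = 120.$$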

In other words, the f(t) we are working with is the result of
superimposing a varying long term oscillation on the seasonal variation given
by 2 sin (t · 30°). Evidently, this problem shows that the size of the
standard deviation of f(t) - P(t) cannot be taken as a criterion of the
applicability of the method.
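The example of Table I can be reproduced by direct computation. The following Python sketch is an illustration only; the function name `f_table_one` is hypothetical, and the sine values are taken at full machine precision, so that individual entries may differ from the one-decimal entries of the table by rounding.

```python
import math


def f_table_one(t):
    """Value of f(t) from the three formulas, t in months, 0 <= t < 120."""
    seasonal = 2 * math.sin(math.radians(30 * t))
    if t < 48:
        long_term = 10 * math.sin(math.radians(360 * t / 48))
    elif t < 60:
        long_term = 0.0
    else:
        long_term = 11 * math.sin(math.radians(360 * t / 60))
    return seasonal + long_term


k = 10
f = [f_table_one(t) for t in range(12 * k)]

# Monthly means P(0), ..., P(11) and the true seasonal variation.
means = [sum(f[m + 12 * j] for j in range(k)) / k for m in range(12)]
seasonal = [2 * math.sin(math.radians(30 * m)) for m in range(12)]

# Standard deviation of the residual function f(t) - P(t).
resid = [f[t] - means[t % 12] for t in range(12 * k)]
sd = math.sqrt(sum(r * r for r in resid) / len(resid))

print([round(v, 1) for v in means])
print(round(sd, 2))
```

The monthly means agree with 2 sin (t · 30°) to within rounding error of the machine, while the standard deviation of the residuals remains near 7.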

The question naturally arises as to what is a proper criterion of ap- 
plicability. The following theorem furnishes us with a theoretical 
answer: 

Theorem (3). The method of monthly means gives us the actual
monthly values of the seasonal variation in case f(t) is made up of the
following component parts:

(A) A seasonal variation, strictly periodic throughout the period of
years under consideration.

(B) A long term variation which consists of certain independent pieces,
each extending over a whole number of years, where each piece represents
a whole number of complete oscillations of a corresponding periodic
function whose period is an integral number (two or greater) of years.

(C) A second, a third, etc., long-term variation having the characteristics
specified in (B).

The language of (A) and (B) is geometrical in nature, each
component of f(t) being thought of as a curve in the (y, t) plane. It is desired
that a certain implicit agreement be understood in the statement of
Theorem (3). As a consequence of the theory of Fourier series, it is
known that a periodic function with the period 5 years, for example,
may have component parts with the respective periods 1/2 of 5 years,
1/3 of 5 years, 1/4 of 5 years, 1/5 of 5 years, or 1 year, 1/6 of 5 years, etc.
The agreement in our case is that the periodic functions referred to in
(B) have no component parts of period one year. If the long-term
variations originally had any one-year components, we are assuming that
these were the same for all the variations and that this constant
one-year oscillation was then thrown in under the variation mentioned
in (A).

An example in which the conditions of Theorem (3) are satisfied 
would be given, for instance, by a function f(t) defined for 17 years,
consisting of 

1. A seasonal variation.

2. A long-term variation which

(a) for the first four years consists of two complete cycles of a periodic function with period two years,

(b) for the next three years consists of one cycle of a periodic function whose period is three years,

(c) for the next ten years consists of two complete cycles of a periodic function whose period is five years.

3. A second long-term variation which

(a) for the first five years consists of one cycle of a periodic function whose period is five years,

(b) for the next twelve years consists of four cycles of a periodic function whose period is three years.

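A series of the kind just described can be built artificially and submitted to the method. The Python sketch below is an illustration under assumptions made here: each long-term piece is taken to be a simple sine wave (which has no component of period one year), and the seasonal values, the amplitudes, the phase, and the function name `piecewise_long_term` are all hypothetical.

```python
import math


def piecewise_long_term(pieces, phase_deg=0.0):
    """Build a long-term variation from independent pieces.

    pieces is a list of (length in years, period in years, amplitude).
    Each piece is a sine wave covering a whole number of its own
    cycles; the period must be two years or greater.
    """
    values = []
    for years, period, amp in pieces:
        if period < 2 or years % period != 0:
            raise ValueError("each piece must hold whole cycles, period >= 2")
        for s in range(12 * years):
            angle = 360 * s / (12 * period) + phase_deg
            values.append(amp * math.sin(math.radians(angle)))
    return values


# Hypothetical seasonal variation, January first.
seasonal = [0.0, 1.5, 2.0, 0.5, -1.0, -2.5, -1.0, 0.0, 3.0, 1.0, -0.5, -3.0]

# Item 2: 4 years (period 2), 3 years (period 3), 10 years (period 5).
first = piecewise_long_term([(4, 2, 5.0), (3, 3, 8.0), (10, 5, 3.0)])
# Item 3: 5 years (period 5), 12 years (period 3).
second = piecewise_long_term([(5, 5, 4.0), (12, 3, 6.0)], phase_deg=40.0)

k = 17
f = [seasonal[t % 12] + first[t] + second[t] for t in range(12 * k)]
means = [sum(f[m + 12 * j] for j in range(k)) / k for m in range(12)]

print([round(v, 6) for v in means])
```

The twelve means returned agree with the assumed seasonal values to within rounding error, as Theorem (3) asserts.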
A certain fundamental property of sines and cosines¹ makes Theorem
(3) obvious to one familiar with the theory of Fourier series. Consider
the particular function cos (t · 15°), which has two years as its period
of oscillation. Let h be any positive integer and let A be less than
720°/h. Then the property referred to is the fact that

$$\cos A + \cos(A + b) + \cos(A + 2b) + \cdots + \cos[A + (h-1)b] = 0, \tag{3}$$

where b = 720°/h. In the averaging process of the method of monthly
means we meet consequences of this property which is possessed by
all sines and cosines. In determining the mean for any one month,
we form a sum of the values of f(t), which, for February, would be

$$f(1) + f(1 + 12) + \cdots + f[1 + (k-1)12]. \tag{4}$$

The effect of the long-term variations (B) and (C) on the sum in (4) is
zero, because of equations like (3) which would hold for each piece of
(B). The seasonal variation, however, furnishes to (4) exactly k times
the value the variation takes on at February in all years. Hence, the
monthly mean of f(t) for February is the exact value of the seasonal
variation for this month. Similar reasoning applies in the case of
the other monthly means, so that Theorem (3) is true.
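The cancellation expressed by equation (3) is easily verified by computation. The following Python sketch is an illustration only, and the function name `cosine_sum` is hypothetical. The sketch takes h from 3 upward, since for h = 1 or h = 2 the spacing b = 720°/h is a whole multiple of 360° and the terms do not cancel. It also forms the February sum (4) for the function cos (t · 15°) over four years, which is the case h = 4, b = 180°.

```python
import math


def cosine_sum(a_deg, h):
    """Left member of (3): sum of cos(A + j*b), j = 0..h-1, b = 720/h degrees."""
    b = 720.0 / h
    return sum(math.cos(math.radians(a_deg + j * b)) for j in range(h))


# Equation (3) for several values of A and of h.
checks = [cosine_sum(a, h) for a in (0.0, 17.0, 100.0) for h in range(3, 13)]
print(max(abs(c) for c in checks))

# February entries of cos(t * 15 degrees) over four years: t = 1, 13, 25, 37.
february_sum = sum(math.cos(math.radians(15 * (1 + 12 * j))) for j in range(4))
print(abs(february_sum) < 1e-12)
```

Each sum vanishes to within the rounding error of the machine, so that a long-term piece of this kind contributes nothing to the monthly mean.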

The statements of Theorems (2) and (3) justify us in applying the 
method of monthly means in the analysis of data which are affected 

1 Bôcher, Annals of Mathematics, Second Series, Vol. 7 (1906), p. 135, Formula (63).
by the business cycle. In such cases, it is true, the assumptions of
Theorem (3) are not exactly satisfied. Nevertheless, there is enough 
similarity between the conditions of Theorem (3) and the actual con- 
ditions affecting the economic data to give us confidence in the results 
obtained by taking monthly means. Moreover, we always have the 
unquestionable truth of Theorem (2) as a foundation for our belief. 
The ultimate test of any method in statistics is, of course, the degree 
of success it attains in actual practice. This test has not as yet been 
applied to any great extent in the case of the present method, but it is 
hoped that its simplicity and ease of application will permit such tests 
in the future. One application to a set of data is treated in the next 
part of this paper in connection with a comparison of the method used 
by Persons and that of monthly means. 



3. COMPARISON OF THE METHOD OF MONTHLY MEANS WITH THAT OF

PROFESSOR PERSONS

Consider the data given in Table II. This set of values was com- 
puted by means of the following formulas : 

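$$f(t) = 15 + \sin t(30^\circ) + 4\sin t(10^\circ), \quad \text{from } t = 0 \text{ to } t = 36,$$

$$f(t) = 15 + \sin t(30^\circ) + 6\sin t(10^\circ), \quad \text{from } t = 36 \text{ to } t = 72,$$

$$f(t) = 15 + \sin t(30^\circ) + 2\sin t(10^\circ), \quad \text{from } t = 72 \text{ to } t = 108.$$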

We have in this case data over 9 years which are the result of
compounding the seasonal variation 15 + sin t(30°) with the long-term
variations given, respectively, by the expressions 4 sin t(10°), 6 sin
t(10°), and 2 sin t(10°). The conditions of Theorem (3) of the previous
section are satisfied, and, therefore, the method of monthly means
leads us to the exact values of the seasonal variation, which are given
in the following table:



   Jan.   Feb.  March  April    May   June   July   Aug.  Sept.   Oct.   Nov.   Dec.
  15.00  15.50  15.87  16.00  15.87  15.50  15.00  14.50  14.13  14.00  14.13  14.50
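The monthly means for this example can be recomputed from the three formulas. The Python sketch below is an illustration only; the function name `f_section_three` is hypothetical, and the sines are taken at full machine precision.

```python
import math


def f_section_three(t):
    """f(t) for the nine-year example, t in months."""
    if t < 36:
        amp = 4
    elif t < 72:
        amp = 6
    else:
        amp = 2
    return 15 + math.sin(math.radians(30 * t)) + amp * math.sin(math.radians(10 * t))


k = 9
f = [f_section_three(t) for t in range(12 * k)]
means = [sum(f[m + 12 * j] for j in range(k)) / k for m in range(12)]

# Seasonal values as tabulated in the text, January first.
tabulated = [15.00, 15.50, 15.87, 16.00, 15.87, 15.50,
             15.00, 14.50, 14.13, 14.00, 14.13, 14.50]

print([round(v, 2) for v in means])
```

Each long-term piece has a period of three years and covers exactly one cycle, so that its effect on every monthly sum is zero and the means reproduce 15 + sin t(30°).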



Let us consider obtaining the values of the seasonal variation by 
Persons' method. First, we are required to compute the ratios of 
each entry in the f(t) table to the preceding entry. The results of
this computation are arranged in monthly columns in Table III. 
Then, in the resulting series of relatives for each of the twelve months, 
we pick out the median values, which are listed in Table IV. In 
using the method under consideration, the assumption now is made 
that the seasonal variation causes, in the average, each February 
value to be 1.018 (Table IV) times the preceding January value, each
March value to be 1.016 times the preceding February value, etc. It 
is not our present purpose to consider the non-mathematical arguments 
in justification of this assumption. However, it must be admitted 
that the assumption appears to be sensible and would be very accepta- 
ble if no logically-founded method were available and if the results to 
be found below were not on hand as contrary evidence. Instead of 
considering an argument, let us compare the results of this method 
with the actual values of the seasonal variation which should be ob- 
tained in the present instance. 
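The first two steps of this procedure, as described above, are the formation of the link relatives and the selection of their monthly medians. The following Python sketch illustrates these steps for the present data. It is an illustration only: the names `f_section_three`, `relatives`, and `medians` are hypothetical, and the values of f(t) are computed at full machine precision, so that the results may differ from the tabulated relatives by a unit or two in the third decimal place.

```python
import math
from statistics import median


def f_section_three(t):
    """f(t) for the nine-year example, t in months."""
    if t < 36:
        amp = 4
    elif t < 72:
        amp = 6
    else:
        amp = 2
    return 15 + math.sin(math.radians(30 * t)) + amp * math.sin(math.radians(10 * t))


# t = 0, ..., 108; the last entry is January of the tenth year,
# which is needed to complete the January column of relatives.
f = [f_section_three(t) for t in range(109)]

# Ratio of each entry to the preceding entry, filed under its month.
relatives = {m: [] for m in range(12)}
for t in range(1, 109):
    relatives[t % 12].append(f[t] / f[t - 1])

# Median link relative for each month, January first.
medians = [median(relatives[m]) for m in range(12)]

print([round(v, 3) for v in medians])
```

There are 108 relatives in all, nine for each month, and the median of each monthly column is its fifth value in order of size.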

TABLE II
TABULATION OF f(t) BY MONTHS

Year      Jan.    Feb.   March   April     May    June    July    Aug.   Sept.    Oct.    Nov.    Dec.

1st     15.000  16.196  17.234  18.000  18.438  18.564  18.464  18.260  18.074  18.000  18.074  18.260
2d      18.464  18.564  18.438  18.000  17.234  16.196  15.000  13.804  12.766  12.000  11.562  11.436
3d      11.536  11.740  11.926  12.000  11.926  11.740  11.536  11.436  11.562  12.000  12.766  13.804
4th     15.000  16.544  17.918  19.000  19.724  20.096  20.196  20.140  20.044  20.000  20.044  20.140
5th     20.196  20.096  19.724  19.000  17.918  16.544  15.000  13.456  12.082  11.000  10.276   9.904
6th      9.804   9.860   9.956  10.000   9.956   9.860   9.804   9.904  10.276  11.000  12.082  13.456
7th     15.000  15.848  16.560  17.000  17.152  17.032  16.732  16.380  16.104  16.000  16.104  16.380
8th     16.732  17.032  17.152  17.000  16.550  15.848  15.000  14.152  13.450  13.000  12.848  12.968
9th     13.268  13.620  13.896  14.000  13.896  13.620  13.268  12.968  12.848  13.000  13.450  14.152



TABLE III
TABULATION OF RELATIVES BY MONTHS

Year      Jan.    Feb.   March   April     May    June    July    Aug.   Sept.    Oct.    Nov.    Dec.

1st     1.060¹   1.080   1.064   1.045   1.024   1.006    .995    .989    .990    .996   1.004   1.010
2d       1.011   1.005    .994    .976    .957    .940    .926    .920    .925    .940    .963    .990
3d       1.009   1.017   1.016   1.006    .994    .984    .983    .991   1.010   1.038   1.064   1.081
4th      1.087   1.103   1.083   1.060   1.038   1.019   1.005    .997    .995    .998   1.002   1.005
5th      1.003    .995    .981    .963    .943    .923    .907    .897    .897    .910    .935    .963
6th       .990   1.006   1.010   1.004    .996    .990    .994   1.010   1.038   1.070   1.098   1.114
7th      1.114   1.057   1.044   1.027   1.009    .993    .982    .979    .983    .994   1.006   1.017
8th      1.021   1.018   1.007    .991    .974    .958    .946    .943    .951    .966    .989   1.009
9th      1.023   1.026   1.021   1.007    .993    .980    .974    .977    .991   1.011   1.035   1.052

1 January, 10th year.

From the monthly medians of the link relatives listed in Table IV
we obtain¹ a corresponding series of relatives on January as a base.
For example, the March entry is found by computing (1.018) (1.016) =
1.034, the April value is (1.006) (1.034) = 1.040, etc., as given in
Table IV. By this method we obtain for January itself the value
(1.021) (.978) = .999, instead of 1.000. This discrepancy¹ is spread
over the whole year, giving the final adjusted values found in Table
IV. For purposes of comparison the true values of the monthly
relatives of the seasonal variation have been computed on January as a
base and are listed in Table IV.

1 Persons, loc. cit., p. 31. 
To find actual numerical values of the seasonal variation from the
final adjusted relatives obtained by Persons' method, we compute the 
mean of all the entries in the original Table II, which is found to be 15. 
Then we multiply in succession by the monthly relatives. The results 
of this computation are given in Table IV, with the true values of the 
seasonal variation listed simultaneously for purposes of comparison. 
The true monthly increments, or decrements, of the seasonal variation 
and those of the computed variation are likewise given in Table IV. 
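The chaining of the median link relatives and the transfer to numerical values can be followed in a short computation. The Python sketch below starts from the twelve medians given in Table IV. One step is an assumption made here and is not taken from the text: the text says only that the discrepancy is spread over the whole year, and the sketch spreads it geometrically, dividing the relative for the m-th month after January by the m/12 power of the closing value. The variable names are likewise hypothetical.

```python
# Median link relatives from Table IV, January first.
medians = [1.021, 1.018, 1.016, 1.006, 0.994, 0.984,
           0.982, 0.979, 0.990, 0.996, 1.004, 1.010]

# Relatives on January as a base: chain the links from February onward.
chained = [1.0]
for m in range(1, 12):
    chained.append(chained[-1] * medians[m])

# Carrying the chain around the year back to January (about .999).
closing = chained[11] * medians[0]

# ASSUMPTION: the discrepancy is distributed geometrically over the year.
adjusted = [chained[m] / closing ** (m / 12) for m in range(12)]

# Multiply the mean of all entries of Table II by the adjusted relatives.
mean_level = 15.0
persons_values = [mean_level * r for r in adjusted]

true_values = [15.00, 15.50, 15.87, 16.00, 15.87, 15.50,
               15.00, 14.50, 14.13, 14.00, 14.13, 14.50]

for m in range(12):
    print(m + 1, round(adjusted[m], 3), round(persons_values[m], 2), true_values[m])
```

Under this assumption the adjusted relatives agree with those of Table IV to the third decimal place, and the computed seasonal values fall well short of the true range from 14.00 to 16.00.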

TABLE IV

Columns: (1) median link relatives; (2) relatives on January as base; (3) adjusted relatives; (4) actual monthly relatives on January as base; (5) computed values of the seasonal variation; (6) true values of the seasonal variation; (7) computed monthly increments; (8) true monthly increments.

Month           (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)

January       1.021    .999   1.000   1.000   15.00   15.00     .31     .50
February      1.018   1.018   1.018   1.033   15.27   15.50     .27     .50
March         1.016   1.034   1.034   1.058   15.52   15.87     .25     .37
April         1.006   1.040   1.041   1.067   15.61   16.00     .09     .13
May            .994   1.034   1.035   1.058   15.52   15.87    -.09    -.13
June           .984   1.018   1.018   1.033   15.27   15.50    -.25    -.37
July           .982    .999   1.000   1.000   15.00   15.00    -.27    -.50
August         .979    .978    .979    .967   14.69   14.50    -.31    -.50
September      .990    .969    .969    .942   14.54   14.13    -.15    -.37
October        .996    .965    .966    .933   14.48   14.00    -.06    -.13
November      1.004    .969    .970    .942   14.53   14.13     .05     .13
December      1.010    .978    .979    .967   14.69   14.50     .16     .37



The conclusions to be drawn from the above problem are obvious. 
By Persons' method we are led through the computation of 108 rela- 
tives, the determination of twelve medians, the reduction of these 
medians to January as a base, the adjustment of these final values to 
distribute a certain discrepancy over the whole year, and then to the 
transfer from these adjusted values to actual numerical values of the 
seasonal variation. Finally, we discover that the error in the monthly 
decrements or increments is generally about 50 per cent. On the other 
hand, the method of monthly means leads to the correct numerical 
values of the seasonal variation after the easy computation of twelve 
monthly means. 
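The size of the error in the increments can be read from Table IV by a simple computation. The Python sketch below is an illustration only; it takes the computed and the true monthly increments as listed in that table, and the variable names are hypothetical.

```python
# Monthly increments from Table IV, January first.
computed = [0.31, 0.27, 0.25, 0.09, -0.09, -0.25,
            -0.27, -0.31, -0.15, -0.06, 0.05, 0.16]
true = [0.50, 0.50, 0.37, 0.13, -0.13, -0.37,
        -0.50, -0.50, -0.37, -0.13, 0.13, 0.37]

# Error of each computed increment as a percentage of the true increment.
pct_error = [100 * abs(c - t) / abs(t) for c, t in zip(computed, true)]
mean_pct = sum(pct_error) / len(pct_error)

print([round(p) for p in pct_error])
print(round(mean_pct))
```

The individual errors run from roughly 30 to roughly 60 per cent of the true increments.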

The actual statistical series in which we search for a seasonal varia- 
tion may differ from the simple case we have treated above, but they 
differ in that they are much more complicated. If Persons' method 
does not yield the correct results in a simple case, we must logically 
consider that its applicability in a complicated instance is open to 
question. 



